Regulatory Repositories

Coline Zeballos, Roche

Yann Féat, mainanalytics

A Universal Conundrum

There are n packages for x; which one is the best?

A Universal Conundrum

By choosing packages, we’re choosing our:

  • Feature set
  • Dependency footprint
  • Integration with other packages
  • Preferred lifecycle management of our tools
  • Community that we can lean on for help

A Universal Conundrum

Regulated Industries: Justification as a Requirement

Goals

  • Provide a community-maintained catalog of package quality indicators (“risk metrics”)
  • Serve quality indicators in a standard format
  • Thoroughly document the system used to perform quality assessment
  • Demonstrate how regulatory-ready risk assessments can be provided using public quality indicators
  • Serve subsets of packages that conform to a specified risk tolerance
  • Improve transparency of industry R package adoption, endorsement and regulator interaction

An evolving R ecosystem

  • (NOTE: show interaction between CRAN, RVH Reg R Repo (us), RC Submissions WG, RC Repositories WG, pharmaverse, other?)

Stage 0 overview

Interacting with the repo

Package risk filters

  • Helper package for system administrators
  • Restricts the packages available for installation to those that conform to a policy
  • Uses the package metadata served by the repo
  • Can be combined with manual checks (e.g., reading a statistical review)

flowchart TD
  A[All packages] --> B{Code Coverage\n > 95%?}
  B --> C{Documentation \n available?}
  C --> D(Available for safety-critical activities)

Usage

Unfiltered

available.packages()
Package
1 colorspace
2 farver
3 isoband
106 tripack

Filtered

fltr <- risk_filter(covr_coverage > 0.95 & has_vignettes)
options(available_packages_filters = fltr)
available.packages()
Package
1 colorspace
2 magrittr
3 R6
32 shinyjs
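
To illustrate how such a filter could plug into base R, here is a minimal sketch. It assumes the risk fields are present as columns of the package index matrix; `sketch_risk_filter()` and the mock index are hypothetical stand-ins, not the helper package's actual implementation.

```r
# Hypothetical stand-in for risk_filter(): captures a policy expression and
# returns a function that subsets a package index matrix.
sketch_risk_filter <- function(expr) {
  cond <- substitute(expr)
  env <- parent.frame()
  function(db) {
    cols <- as.data.frame(db, stringsAsFactors = FALSE)
    # Index fields arrive as character strings; convert the numeric-looking ones.
    cols[] <- lapply(cols, function(x) {
      num <- suppressWarnings(as.numeric(x))
      if (all(is.na(num) == is.na(x))) num else x
    })
    keep <- eval(cond, cols, env)
    db[!is.na(keep) & as.logical(keep), , drop = FALSE]
  }
}

# Mock index with made-up metric values (illustrative only).
db <- cbind(
  Package       = c("colorspace", "farver", "magrittr"),
  covr_coverage = c("0.97", "0.80", "0.99"),
  has_vignettes = c("1", "1", "0")
)

fltr <- sketch_risk_filter(covr_coverage > 0.95 & has_vignettes == 1)
filtered <- fltr(db)
print(filtered[, "Package"])

# With a real repository, base R accepts filter functions in a list, e.g.
# options(available_packages_filters = list(add = TRUE, fltr))
```

Capturing the policy as an unevaluated expression keeps the policy readable for administrators while leaving the filtering itself to the standard `available.packages()` filter mechanism.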

Repository ‘back-end’

Repository forked from r-hub/repos

Mandatory package metadata for a package repository

Package: bslib
Version: 0.6.1
Depends: R (>= 2.10), R (>= 4.4), R (< 4.4.99)
License: MIT + file LICENSE
Built: R 4.4.0; ; 2023-11-29 16:39:06 UTC; unix
RVersion: 4.4
Platform: x86_64-pc-linux-gnu-ubuntu-22.04
Imports: base64enc, cachem, grDevices, htmltools (>= 0.5.7), jquerylib (>= 0.1.3),
         jsonlite, lifecycle, memoise (>= 2.0.1), mime, rlang, sass (>= 0.4.0)
...

Added fields for risk-based assessment

riskmetric_run_date: 2023-06-21
riskmetric_version: 0.2.1
pkg_score: 0.291481580696657
covr_coverage: 0.852820470987098
has_vignettes: 1
has_news: 1
reverse_dependencies: 0.900122893005357
dependencies: 0.0474258731775666
remote_checks: 0.846153846153846
has_maintainer: 1
...
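
Because the risk fields sit alongside the standard fields in the same DCF-style record, they can be read with base R tools. The sketch below assumes a local file standing in for the repository index; the file and the `risk` data frame are illustrative, and the values are the ones shown on these slides.

```r
# Write a one-record index to a temporary file (stand-in for the repo's PACKAGES file).
packages_file <- tempfile()
writeLines(c(
  "Package: bslib",
  "Version: 0.6.1",
  "License: MIT + file LICENSE",
  "riskmetric_version: 0.2.1",
  "covr_coverage: 0.852820470987098",
  "has_vignettes: 1"
), packages_file)

# read.dcf() returns a character matrix with one column per field.
index <- read.dcf(packages_file)

# Convert the risk fields to usable types.
risk <- data.frame(
  package       = unname(index[, "Package"]),
  covr_coverage = as.numeric(index[, "covr_coverage"]),
  has_vignettes = unname(index[, "has_vignettes"]) == "1",
  stringsAsFactors = FALSE
)
print(risk)

# Against a live repository, extra fields can be requested with
# available.packages(fields = c("covr_coverage", "has_vignettes")).
```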

Package cohort validation

  • Risk metrics are recalculated for packages with new versions and for their reverse dependencies
  • Uses the GitHub API to fetch new release assets
package    version  ver_old  pkg_score  has_news
bslib      0.7.0    0.6.1    0.4998     1
dbplyr     2.5.0    2.4.0    0.4668     1
htmltools  0.5.8.1  0.5.7    0.4811     1
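
One way to picture cohort selection is to compare the previous and current index, then add the reverse dependencies of every updated package. This is a sketch under assumptions, not the repository's pipeline: the index is a mock matrix, `examplepkg` and its dependency on dbplyr are hypothetical, and only the three version pairs come from the table above.

```r
# Previously assessed versions (examplepkg is a hypothetical, unchanged package).
old <- c(bslib = "0.6.1", dbplyr = "2.4.0", htmltools = "0.5.7", examplepkg = "1.0.0")

# Mock current index; the Imports entries are illustrative.
new_index <- cbind(
  Package   = c("bslib", "dbplyr", "htmltools", "examplepkg"),
  Version   = c("0.7.0", "2.5.0", "0.5.8.1", "1.0.0"),
  Depends   = NA_character_,
  Imports   = c("htmltools (>= 0.5.7)", NA, NA, "dbplyr"),
  LinkingTo = NA_character_
)

# Packages whose version increased since the last assessment.
is_newer <- package_version(new_index[, "Version"]) >
  package_version(unname(old[new_index[, "Package"]]))
updated <- new_index[is_newer, "Package"]

# Direct reverse dependencies of the updated packages.
revdeps <- tools::package_dependencies(updated, db = new_index, reverse = TRUE)

# The cohort to re-assess: updated packages plus their reverse dependencies.
cohort <- sort(unique(c(updated, unlist(revdeps, use.names = FALSE))))
print(cohort)
```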

Our roadmap

Reference container image(s)

  • Should mimic the environments of companies and health authority reviewers
  • Integrates with most modern analytic workbench tools and with an evaluation pipeline
  • To be used by the Regulatory R Repository for package cohort validation

What’s next

  • (NOTE: STAGE 1: Pipeline Integration)

Closing

Join us

  • (NOTE: link to GH join us issue, add R Consortium info)

Thank you

  • (NOTE: list of Core team members)